Force Unfolding Single RNAs: from Equilibrium to Far-from Equilibrium 
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We summarize the recent simulation progress of micromanipulation experiments on RNAs. Our 
work mainly consults with two important small RNAs unfolding experiments carried out by 
Bustamante group. Our results show that, in contrast to protein cases, using the single polymer 
elastic theory and the well known RNA secondary structure free energy knowledge, we can 
successively simulate various behaviors of force unfolding RNAs under different experimental 
setups from equilibrium to far-from equilibrium. Particularly, our simulation would be helpful 
in understanding Jarzynski's remarkable equality, which its experimental test has received 
considerable attention. 
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I. INTRODUCTION 

Ribonucleic Acid (RNA) is now known to be involved 
in many biological processes, such as carriers of genetic 
information (messenger RNAs) , simple adapters of amino 
acids (transfer RNAs), and enzymes catalyzing the reac- 
tions in protein synthesis, cleavage and synthesis of phos- 
phodiester bonds In particular, recent discoveries 

indicated that a class of RNA called small RNA operates 
many of cell's control (for a report, seeQ). These di- 
verse and specific biological functions of RNA are guided 
by their unique three-dimensional folding. Therefore, 
prediction or measurements of RNA folding and folding 
dynamics becomes one of central problems in biological 
studies. 

In addition to standard experimental methods such 
as X-ray crystallograph and NMR spectroscopy, single- 
molecule manipulation technique developed in the past 
decade provides a fresh and promisingway in resolving 
the RNA folding problem l|a la 13 □> Is @) • As a concrete 
example, an optical tweezer setup is sketched in Fig. 
ijlftllll) : a single RNA molecule is attached between two 
beads with RNA:DNA hybrid handle; one bead is held 
by a pipette, and the other is in a laser light trap 1 . By 
moving the position of the pipette, the distance between 
the two beads and the force acting on the bead in the 
light trap can be measured with high resolution. This 
sophisticated setup has showed its abilities in recording 
the time-traces of the end-to-end distance of a small 22- 
basepair RNA hairpin (0), and resolving complicated un- 
folding pathways of 1540-base long 16S ribosomal RNA 

On theoretical side, although complete three- 
dimensional RNA folding prediction so far seems enor- 



1 In practice, the RNA is attached between the two beads with 
two RNA:DNA hybrid handles. To simplify simulation method, 



only one handle is considered, 
discussions. 



It should not change following 



mously difficult l)12D . RNA structural prediction from 
physical point of view has made great progress on the 
level of secondary structure ifla Il4t Ua) . The advent 
of the single-molecule experiments addresses a challeng- 
ing issue for theorists: whether or how can we apply the 
known secondary structural RNA knowledge to explain 
or predict the phenomena observed in the single-molecule 
experiments? Under force stretching or twisting, the elas- 
tic properties which were cared little or even neglected 
before now must be seriously took into account. 

Many theoretical efforts have been devoted to under- 
stand force unfolding RNAs <E [H H E HJ EHH . 
However these theories or models are too simple to be ap- 
plied in experiments; useful free energy data about RNA 
secondary structure obtained before were often neglected. 
Moreover, they just studied equilibrium cases, while in- 
triguing nonequilibrium phenomena were beyond their 
scopes. Simulation method should be a good choice to 
overcome these shortcomings. But we noted that, com- 
pared to enormous simulation works about force unfold- 
ing proteins iH IH IH |22l) , the simulations for RNAs 
are few (|5|) though biological importance of the later is 
the same as the former. To fill this gap, our group de- 
veloped stochastic methods and applied it to investigate 
the interesting force unfolding single RNAs experiments 
l|2ll l29l). In this paper, we will summarize our previous 
effort and extend them to investigate more intriguing is- 
sue, the remarkable Jarzynski's equality (30), which its 
experimental test has attracted considerable attention. 



II. MODEL AND METHOD 

A. RNA folding without force 

A RNA sequence is denoted by a nucleotides string 
I = (xi, X21 x n ),Xi <G {A, U, C, G}; the bases x\ and 
Xji are the nucleotides at 5' and 3' end of the sequence, 
respectively. A secondary structure S of a RNA sequence 
is a list of base pairs [xi,Xj] that must satisfy two con- 
ditions: every base forms a pair with at most one other 
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FIG. 1 Sketch of an optical tweezer setup and the RNA 
molecules studied in the work. We denote the region between 
the two arcs as the optical trap. RNA molecules are attached 
between the two beads (larger black points) with a RNA:DNA 
hybrid handle (the black dash curves) . The center of the light 
trap is moved with velocity v. Here the total distance at time 
t is z(t) = x tw + x ds + x 3S . The individual extensions, x tw 
the position of the bead with respect to the center of the op- 
tical trap, x da the end-to-end distance of the double-stranded 
DNA (dsDNA) handles, and x as The end-to-end distance of 
the single RNA are freely fluctuated. The RNA native struc- 
tures for the three small RNA sequences, P5ab, P5abcAA, 
and P5abc are folded by Vienna Package 1.4. 



base, and if any two base pairs [x%, Xj] and [xk, x{\ are in 
the list, then i < k < j implies i < I < j. All structures of 
the sequence I comprise a set, S(l) — {So, S%, S m , 0}, 
here denotes the completely open chain conformation. 

In order to describe the folding or unfolding process as 
a time-ordered series of the structures in the set S(l), a 
relation M which specifies whether two structures are ac- 
cessible from each other by an elementary "move" must 
be reasonably defined. The definition is identical to spec- 
ifying a metric in the set S(l). Any secondary struc- 
ture formation or dissolving hence can be described by a 
succession of elementary steps chosen according to some 
distributions from a pool of acceptable moves in the con- 
formational space C(l) = {S(l),M}. In the absence of 
mechanical force, two kinds of move sets have been pro- 
posed in modelling secondary structural RNA fol ding : 
one is the formation or decay of a single helix <|32t 1331 ). 
and the other is the removal or insertion of single base 
pairs per time step We make use of the latter, 

for it is the simplest move set on the level of secondary 
structure. Moreover, we mainly focus on smaller RNAs. 
The formation or removal of a helix may cause larger 



structural changes, while its physical relevance of RNA 
folding or unfolding seems debatable. 



B. RNA unfolding under mechanical force 

According to the difference of the external controlled 
parameters, the RNA unfolding experiments can be car- 
ried out under constant extension and constant force, i.e., 
the constant extension and the constant force ensembles 

We first consider the constant extension ensemble. Fig. 
^is the sketch of an optical tweezer setup for this ensem- 
ble. The position of the center of the light trap is moved 
according to a time-dependent relationship z(t) — z Q +vt, 
where z(t) is the distance between the centers of the light 
trap and the bead held by the micropipette, z Q is offset 
at time t = and v is the constant velocity. We suppose 
that the changes of the extensions of RNA and the han- 
dle proceed along one direction, and the physical effect of 
the beads is negligible. Any state of the system at time t 
then can be specified with three independent quantities, 
the extension of the RNA x 3S , the end-to-end distance 
of the handle x ds , and the RNA secondary structure S, 
i.e., the system in i-state at time t (Si, x ds , xf s ) t - We do 
not include x here for the sum of individual extensions 
satisfies the constraint condition, z(t) = x tw + x ds + x ss . 
Hence, the unfolding of the single RNA proceeds in an 
conformational space C(l) = S(l) x R ds x R ss , where 
R ds = (0, Ids) and R ss = (0, l ss ) and , and Ids and l ss are 
the contour lengths of the handle and the RNA molecule, 
respectively. In order to describe this process as a time- 
ordered series of the conformations in the space, a re- 
lation M which specifies whether two conformations are 
accessible from each other by an elementary "move" (or 
neighbors) must be reasonably defined. We propose the 
following move set J2fi * 



(Si, x,i s , xl s )t — > (Sj, x i s , xl s )t> ,i 7^ j 
(S i ,xt s ,xr)t^(Si,xf s TS,x s i s ±6) tl 

(s i ,xt s ,xr)t^(s i ,xi s ±6,xr)t'. 



(1) 



The first kind of the moves is the removal or insertion 
of single base pairs while fixing the extensions x ds and 
x ss . The other two kinds are to respectively move the 
positions of the end of the handle and the end of single- 
stranded RNA with a small displacement S, while the 
secondary structure is fixed simultaneously. 

Given the system state i at time t, the systematic en- 
ergy can be written as 

Ei(t) = AG" + u(xl w ) + W ds (x ds ) + W ss (xt s ,n t ), (2) 

where AG? is the free energy obtained from folding the 
RNA sequence into the secondary structure Si, and the 
elastic energies of the optical trap, the handle, and the 
single-stranded part of the RNA are 

u(xf) = \k tw xT(tf, 
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W ds (xf s ) = / f ds (x')dx' 





(3) 



^"(af'.ni) = xl s f(xi s ,m)- J x ss (f \ ni )df ', 

o 

respectively, and x- u '(i) = z{t) — xf s — xf s . In the ex- 
pression W ds , fds(x') is the average force of the handle 
at given extension x', 



fds{x') = 



1 



k B T 



P ds \A{l~x'/l ds y 



and Pd s is the persistence length of double stranded han- 
dle. In the expression W ss , x ss (f',ni) is the average 
extension of the single stranded part of the RNA whose 
bases (exterior bases) are m at given force /', 



= nibs 



tip 
[coth(i-%) 



(5) 



where b ss and P ss are the monomer distance and the 
Kuhn length of the single-stranded RNA, respectively 
ifTol : l36|) . Note that f{x s i s 1 n i ) is the inverse function of 
Xssif ') n i); an d the contour length of the RNA l ss = b ss n. 
The light trap here is simply assumed to be a harmonic 
potential with spring constant k tw . Hence the loading 
rate is r = kt w v. 

In the real experiments, constant force can be im- 
posed on RNA molecules with feedback-stabilized optical 
tweezers capable of maintaining a preset force by mov- 
ing the beads closer or further apart. Including the feed- 
back mechanism in theoretical study is not essential now. 
Therefore the energy of tweezer in Eq. [21 is replaced by 
-/ x (xf* + xf* ). 



C. Continuous time Monte Carlo algorithm 

Given the move sets and the unfolding conformational 
spaces, the RNA unfolding for the two ensembles can be 
modelled as a Markov process in their respective spaces. 
Following conventional stochastic kinetics of chemical re- 
actions, these processes are described as the master equa- 
tion, 



dPjjt) 
dt 



3=0 



Pi{t)h 



(6) 



where p (t) is the probability of the system being i-state 
at time t, and k%j is the transition probability from i-state 
to j-state. 

The form of the master equation looks relatively sim- 
ple, however it is mathematically intractable to solve ana- 
lytically for simple "reaction" system such as RNA P5ab. 
Previous work has demonstrated that a continuous time 
Monte Carlo simulation is an excellent approach toward 



the stochastic process described by Eq. (2 l38i) . 
As a variant of the standard Monte Carlo method, the 
continuous time Monte Carlo (CTMC) method is very 
efficient and fast because of lacking of waiting times due 
to rejection. In contrast to standard MC method, instead 
of the MC step used to approximate the real time, the 
"time" in Gillespie's method could be real if the tran- 
sition probabilities were calculated by first principles or 
empirically. 

The key formula in CTMC is that, given the sys- 
tem at i-state at current time t, the probability density 
p(j, t'\i. t) that the next state is j and it occurs at time t' 
is 



pti,t'\i,t) = kf j e X p(- J E fc S dr )' 



(7) 



where kj k are transition probabilities from the i-state 
to the neighbouring fc-state at time r, which can be 
time-dependent or time-independent (no parameters r) 
I3H l39l) . and the sum is over all neighbors of i-state. Ac- 
cording to Eq. [7| then the time t' for the next state to 
occur can be obtained by solving the following equation, 



n = exp(- 



(8) 



where r± is a uniform random number in the interval [0,1] . 
For time-independent situation, the equation reduces 
to most common expression r\ = exp[— (t' — t) y\ ku] . 
While if the transition probabilities involve time t, then 
numerical methods for integration and root finding have 
to be applied (flol) . Then the next state j is chosen if 
another uniform random number < Xw=i I ■ 
We assume that the transition probabilities satisfy the 
symmetric rule Ell) 



K ij 



■exv{-P{Ei{t)-Ei{t))l2), 



(9) 



where r Q scales the time axis of the unfolding process 
from the experimental measurements. Apparently, the 
transition probabilities satisfy the detailed balance con- 
dition. 



D. Partition function method in equilibrium 

If the moving velocity of the light trap vanishes, an 
exact partition function method can calculate the molec- 
ular average extension and the average force under the 
given distance z (|20r> . The method is an extension of the 
partition function method proposed for RNA secondary 
structural prediction l)14ri . Different from the experimen- 
tal measurements of the free energy with slow pulling 
velocity (quasi-equilibrium process) (0), we obtain the 
equilibrium information by this exact method. Consider- 
ing coincidences of formulae and new physical quantities 
needed, we rewrite the formulae in Ref. <|2fJ|) . 
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The key idea of the exact method is that the partition 
function over all secondary structures of a given RNA can 
be calculated by dynamic programming Given the 

partition function Q{i, j, m) on the sequence segment [i,j] 
with exterior bases m, its recursion formula is as follows, 

Q(i,j,m) = 15 mjj - i+ i + qb(i,A + j-m) 

j-l k-i+l 
k=i 1=1 

xqb(k + l,l + A + j-m), 

where A = 2, the partition function qb(i,j) on the se- 
quence segment [i,j] for which the i and j bases are paired; 
Vienna Package 1.4 provides the calculation codes lf42l) . 

For the constant extension ensemble, let the partition 
function of the RNA molecule at extension x (including 
the handle) be Z n (x). Then the function can be written 
as 

" fids pmb ss 

Z n (x) = V / / dx ds dx ss 6{x-x ds -x ss ) 

Q(l,n,m)exp(-PW(x ds ,x ss ,m)) (11) 

where W(x ds , x ss : n) = W ds (x ds ) + W ss (x ss , n). The 
molecular free energy landscape along x then is G (x) = 
— In Z n (x). To calculate the average force (/) and the 
average extension (x) at given distance z, the systematic 
partition function Z n (z) required is 

Z n {z) = j dxZ n (x) exp(— (3u(z ~ x)). (12) 
Jo 

Then the systematic free energy G(z) = — /3" 1 lnZ„, 
(/) = dG(z)/8z and (x) = z - (f)/k tw . 

While for the constant force ensemble, the partition 
function Z n (f) under given force / is 

Z n (f)= dxZ n (x)exp{pfx), (13) 

Jo 

and the average extension (a;) = j3~ 1 d\nZ n /df. 

E. Parameters and measurement 

We simulate the single RNA folding and unfolding un- 
der mechanical force at the experimental temperature 
T = 298K. The elastic parameters used are: Pd s = 53 
nm, Ids — 320 nm, b ss — 0.56 nm, P ss =1.5 nm, and 
ktw — 0.2 pN/nm. We use the single-stranded DNA pa- 
rameters for the single stranded part of RNA because 
they have similar chemical structure. The displacement 
5 = 0.1 nm. The free energy parameters for the RNA 
secondary structures are from the Vienna package 1.4 
f42T) in standard salt concentrations [iVa + ] = 1 M and 
[Mg 2+ ] = M. In addition to the standard Watson-Crick 
base pairs (AU and CG), GU base pair is allowed in our 



simulation. Formation of an isolated base pairs is forbid- 
den because of their instability. In the constant extension 
ensembles, the force fi acting on the RNA molecule at i- 
state is calculated by fi — k tw xl w , and the bead-to-bead 
distance x\ b = x ds +xf s . In the constant force ensemble, 
the extension of the molecule is just x^. 



III. RESULTS AND DISCUSSION 
A. Single RNAs thermodynamics 

A comparison between our simulation in equilibrium 
and the prediction of the exact partition function method 
should be helpful in confirming the correctness of our 
method. We simulate the average force-extension curves 
of the three RNA molecules for the two ensembles with 
standard approach: the average physical quantity A 
is calculated according to (A) = t _1 A(t)dt, here 
t = 10 6 ; see Fig. [2] The force-extension curves cal- 
culated by the exact method are plotted with different 
kinds of curves. We can find that these two independent 
calculations agree very well. 

Although the two methods quite consist each other, the 
values of the unfolding forces have apparent discrepancies 
with experimental measurements. For example, in the 
absence of Mg 2+ the values are 13.3, 11.3 and 8.0 pN for 
P5ab, P5abcAA, and P5abc molecules for the constant 
extension ensemble, respectively (jg). It is not strange 
because we do not include the effect of ionic concentra- 
tion in our model. Hence we choose a reasonable ionic 
correction of RNA free energies f43T) . Unfortunately, we 
still do not get good results; see Table I. There are tow 
possible causes leading to such discrepancies. One is that 
the ionic corrections or free energy parameters for RNA 
are not precise enough; they cannot be used in the force 
unfolding cases. The other is that polymer elastic pa- 
rameters are not very precise. We prefer to the later. In 
addition to RNA free energy measured and tested for al- 
most forty years, the persistent length of ssDNA in ionic 
environment is still debatable lf45h . For instance, we cal- 
culate the unfolding forces of the three molecule with 
P ss = 2.2 nm and indeed find that they are closer to the 
experimental values. As a further demonstration, we also 
list other values measured in previous experiments and 
compare them with theoretical predictions in the same 
table. 



B. Single RNAs kinetics 

1. Constant extension ensemble 

Force-extension curves. As an example, we stretch 
P5ab molecule with the velocity v = 5 x 10~ 3 A from the 
offset z a = 350 nm to z(t) — 450 nm, and then relax it 
with the same velocity; here we let r = 1. One of the 
time trajectories is showed in Fig. [2JA Apparently, the 
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TABLE I The unfolding forces /„ of different molecules under different experimental conditions. The experimental data are 
from the previously published data EH) ■ The theoretical values are from the exact numerical methods developed above, 

where f^, i = 1, 2, 3 represent the unfolding forces without the ionic correction, with the ionic correction on the free energy and 
with the ionic and the persistent length corrections, respectively. Here We do not show the P5abc unfolding force for it is not 
reversible in Mg 2+ due to the presence of tertiary interactions. 



Molecule temperature (K) Na+ QM) Mg 2+ (mM) fj (pN) f% (pN) fl (pN) f£* (pN) 
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P5abcAA 
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22.4 


17.0 


poly(dC-dG) 


293 


150 





25.1 


23.8 


21.7 


20.0 



unfolding and refolding trajectories are not coincident, 
i.e., force- hysteresis, which indicates that the molecule is 
driven from thermodynamic equilibrium (|6j) occurs. 

In experiments record of the force and extension at 
given times with a slow velocity is a more common 
method in the equilibrium measurement. Hence we 
simulate the three curves with the two slow velocities 
1 x 10 _4 A and 1 x 10~ 5 A. Because enormous data would 
be generated if the time trajectories were recorded, we 
only show the data per unit times 10 5 and 10 6 (see Figs. 
0P and D). For the faster velocity we find that, except 
P5ab case, the unfolding forces for the others do not equal 
the equilibrium values; whereas for the later, the curves 
of simulations consist with the exact curves. It means 
that the unfolding of the three molecules with 1 x 10 _5 A 
is or near equilibrium. Two features in Fig. are of 
our interest. Compared with the curves obtained by the 
time averaging, the curves recorded at time points are 
very rough even before and after the unfolding. And al- 
though the whole extension z(t) monotonically increases 
with time, the extensions of the molecules may still jump 
between two values, such as P5ab and P5abc molecules. 
Indeed similar phenomena were also observed in the ex- 
periment (|6() . They indicate the fluctuations of the exten- 
sion and RNA structures under the force. Just because 
of this observation, and that the phenomena are not rare 
in simulation and the experiment. 

In the experiment the unfolding P5abc are near- 

equilibrium and far from equilibrium at the loading rates 
2-5 pN/s and 34-52 pN/s, respectively (similar values for 
P5abcAA). And our simulations also show that the un- 
folding the same molecule are near-equilibrium and far 
from equilibrium at the velocities 10 -5 A and 10 _4 A, re- 
spectively. Let them be equal correspondingly we then 
can estimate the constant t d 10~ 7 s. We will scale the 
time with this parameter below for convenience. 

Fig. 0] shows 100 trajectories with two loading rates 
for P5ab and P5abc molecules. The trajectories are 
stretched from the same offset z a — 350 nm after ther- 



mal equilibrium until the terminal extension z — 450 nm. 
For both the loading rates, below and above the unfold- 
ing forces, the force-extension curves are dominated by 
the double-stranded handles. But the values of the un- 
folding forces apparently fluctuate and dependent on the 
rate and the molecular type. When the pulling speed 
is faster, or the loading rate is larger (1000 pN/s), the 
average unfolding force increases correspondingly. This 
phenomenon has been theoretically predicted earlier l|46l) . 
We note that at the same loading rate, the trajectories of 
P5ab are closer to its equilibrium force-extension curve 
than the trajectories of P5abc. It means that the relax- 
ing process of the former is faster than the latter. This 
fact has also been observed in the experiment (6). 

Free energy reconstruction. According to Ref. 
l)47lh the unperturbed molecular free energy landscape 
G {x) along the molecular extension x can be calculated 
from position-versus-time curves with the expression 

G o (x)-G(t = 0) = (14) 
-/3- 1 log^(s-x(i))exp(A Wt )} 

where Aw t = w t — k tw {x{t) — vt) 2 /2, G(t = 0) is the free 
energy of the whole system in equilibrium at initial time 
t = 0, and 

Wt = k tw v(vt 2 /2 + z a t - I x{t')dt') (15) 

J o 

The free energy landscape can be reconstructed by one 
time slice according to Eq. ^] But considering that 
for finite stretching trajectories, we only sample a small 
window around the molecular equilibrium position at the 
whole extension z(t), Therefore, a weighted histogram 
method was proposed lj4^h 

G (x) - G{t = 0) = (16) 

v-> (S(x—x t ) cxp {-(3w ti )) 



ti (cxp(-/3u) ti )) 
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FIG. 2 Comparison of the exact and simulation force- 
extension curves in equilibrium for P5ab, P5abcAA and 
P5abc on the constant extension (A) and the constant force 
(B) ensembles. The different symbols are from the simulation 
methods, and the different lines are from the exact methods. 
They agree with each other very well. Inset, force-extension 
curves for the same ensemble recalculated by another move 
set. 



where the sum is over many time slices t' , and the average 
is over the repeated trajectories at each given time slice. 
For each trajectory, we choose the discrete time i$ = iAt, 
i = 1, • ••, 100, here At = 10/v, i.e., the time moving the 
light trap 1 nm (or every point in Fig. 0J. 

Fig. shows the finally reconstructed free energy 
landscapes for the two molecules at two loading rates 
20 and 40 pN/s. The precisions of reconstructions are 
satisfactory. We note that landscapes are unexpectedly 
trivial: neither of them presents energy barrier. Ref. 
fish has investigated Jarzynski's equality by modelling 
RNA molecules as a two-level system with an intermedi- 
ate barrier. Our calculations apparently contradict their 
assumption. Indeed, the strong unfolding-refolding coop- 
erativity observed in the experiments (jg; 0) arises from 
the coupling of the RNA molecules and the light trap; 
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FIG. 3 A. One of the time trajectories of unfolding and re- 
folding for P5ab with velocity 5 x 10 _3 A. Force-hysteresis is 
observed. B. Fig. 2 A is showed here again. C. The unfold- 
ing force-extension curves recorded at unit time 10 5 for the 
molecules with velocity 1 x 10 _4 A. D. The force-extension 
curves recorded at time unit 10 6 with velocity 1 x 10 _5 A. 
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FIG. 4 Unfolding trajectories of P5ab (A,B) and P5abc 
(C,D) with loading rates 20 and 1000 pN/s. Curves (superpo- 
sition of 100 curves per figure) are represented by 100 points 
with the equal time interval; for clarity we do not connect 
them with lines. 



the addition of their potentials is a two-level system (see 
the respective insets in the figure). Therefore, the two- 
level system, although is a good approximation in RNA 
folding study, should not be simply copied to the force 
unfolding cases. 

Free Energy Difference Estimators Although 
Jarzynski's equality has many applications, our under- 
standing of its behavior is still rather limited |49|) . For 
example, we are not clear whether other free energy es- 
timators are better than the equality, and whether the 
number of repeated trajectories in landscape reconstruc- 
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FIG. 5 Comparison of the free energy landscapes of the two 
molecules P5ab and P5abc reconstructed from the Jarzynski 
equality with two loading rates 20 and 40 pN/s and the ex- 
act landscapes calculated from the partition function method. 
The number of trajectories for each case is 1000. The insets 
are the free energy landscapes of the system composed of the 
molecules and the light trap potential, which are from parti- 
tion function method. Note that we do not show the scales of 
the extensions and free energies. 



estimator is more or less better than the JE as the ex- 
tension z > 415 nm. This trend is more apparent as 
P5ab is unfolding with smaller loading rate 20 pN/s. In 
contrast, the JE estimator for P5abc is superior to the 
FD estimator over the entire extension range at the two 
loading rates. 
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tion above is enough or excessive to achieve reasonable 
precisions. Recently such discussions have attracted con- 
siderable interests pa l50t l51|l . Our simulation here pro- 
vides a good opportunity to numerically investigate these 
questions. 

Instead of the molecular free energies, we will use the 
systematic free energies (the molecule and the optical 
trap) for simplicity. Hence we define Jarzynski estima- 
tor AGje{z) — —f}~ x log(exp(— (3w z ))n (we here replace 
time t by the whole extension z because of the linear rela- 
tion between them) and AGje(z) = Gje{z) — Gje(z )- 
Two other common estimators that can be used to calcu- 
late the free energy differences are the mean work estima- 
tor AGmw(z) = (w z )n and the fluctuation-dissipation 
theorem estimator AGfd(z) — (w z )n — flo^, where a w 
is the standard deviation of the work distribution lf52Th 

To get an intuitive observations about the three esti- 
mators, we calculate the free energy differences between 
the estimators and the exact energies, AGi(z) — AG(z), 
i =MW, FD, and JE. The differences of P5ab and P5abc 
with the loading rates 20 and 40 pN/s respectively are 
showed in Figs. EJA and B; here we choose N=1000. 

There are two common features in the figure. First is 
that the free energy differences for each estimator are not 
uniform along the distance. For example for JE estima- 
tor, the differences are maximum around the unfolding 
distances such as 415 nm for P5ab. Therefore we con- 
clude that noncquilibrium behaviors of the same molecule 
are not uniform along its extension, even if the RNA is 
unfolded with the same loading rate. Then for both the 
molecules, the JE estimator is always better than the 
MW at any loading rates. For the P5ab case, the FD 



FIG. 6 A and B. The differences between the three free 
energy estimators and the exact energies for P5ab and P5abc 
at two loading rates 20 (closed symbols) and 40 pN/s(crossed 
symbols): MW circle and plus, FD square and times, and JE 
diamond and star. Here N=1000. C and D. Histograms of the 
dissipated works at extension z = 430nm for P5ab and P5abc 
molecules at the two loading rates. The lines are Gaussian 
functions with mean and variance from the same data, where 
the dotted lines are for 20 pN/s, and the dashed lines are for 
40 pN/s. 



Because the above analysis is carried out at a given TV- 
value, we do not consider the errors caused by different 
samples. In addition, we also do not consider how the re- 
sults depend on the number of trajectories. Ref. |49l) has 
studied the issues in two extreme situations: the large N 
limit and the system in the near-equilibrium regime. Be- 
cause of N < 100 in the current real experiment, in the 
present work we investigate them in the middle value of 
N, although it is not hard to run 10 5 trajectories in our 
simulation "experiment" . Previous analysis has found 
that different from the case far from equilibrium, the 
properties of the three estimators in the near-equilibrium 
regime are independent of concrete models. Hence the in- 
vestigation should provide a good opportunity to test the 
correctness of the unfolding kinetics we designed. 

Ref. |49l) has introduced three important properties 
associated with the estimators: bias Bi(z) = (AGi(z) — 
AG(z)), which represents systematic error created by the 
finite N; variance af{z) = ((AG l {z) - (AG,(z))) 2 ) ac- 
counting for statistical error because of different samples, 
and mean square error (MSE) , a standard measure for the 
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quality of an estimator, MSEi(z) = a 2 (z) + Bf (z) , where 
i=MW, FD and JE. Although these quantities depend on 
the distance z, we choose a representation at z=430 nm 
for simplifying the discussion below: when the extension 
z is greater than the unfolding distance, the qualitative 
behaviors of the quantities are almost the same. The 
biases and variances dependence on N for the JE esti- 
mators for P5ab and P5abc are plotted in Fig. [7^ and 
B, and the MSEs dependence on N for the three esti- 
mators are in Fig. and D, where two loading rates 
are 20 and 40 pN/s, respectively, and each symbol is the 
average of 100 sets. 

For P5ab case, we note that both the biases and 
variances of the JE estimator at the two loading 
rates can be well approximated with power func- 
tions. For example, at the loading rate 20 pN/s, the 
bias B JE (A30) = (w dls )(430)/N a = 1.57/iV - 55 , and 
<J 2 JE {Am) = al/Nf = 2.86/7V ' 65 , where the dissipated 
work Wdis(z = 430) is defined as w z — AG(z), and 
a w = ((' w z — (w z )) 2 )', they also are the bias and variance 
of the JE estimator with N = 1. This interesting obser- 
vation can be understood from the distributions of the 
dissipated works. Fig. shows their histograms. We 
find that they both agree well with Gaussian functions 
whose means and variances are obtained from the corre- 
sponding data. Indeed, according to the force-extension 
curves, we argued that the unfolding of P5ab at load- 
ing rate 20 pN/s is near-equilibrium; see Fig. [3J\. The 
good agreement between the histograms and Gaussian 
function therefore is not unexpected: when a system is 
in the near-equilibrium regime, it always have a Gaus- 
sian dissipated work distribution, and in particular, an 
important equality holds, a w = 2/3 -1 (wdiss) (52); here 
2.86 « 2 x 1.57. Another demonstration of P5ab in the 
near-equilibrium regime is that the MSEs of the three 
estimators obtained b y o ur simulations consist with the 
following expressions I|49tl5ll) : 



that at the loading rates we presented, although the vari- 
ances can still be approximated with power functions, 
e.g., a 2 JE (A30) = 13.46/7V 9 for P5abc at 20 pN/s, no 
such functions present in the biases. The discrepancies 
of the histogram and the Gaussian function of P5abc in 
Fig. Et* also reflect it. Even if no analytic results have 
been obtained except the large N limit l|4(*l5(Tr). our sim- 
ulations still give some hints about the properties of the 
three estimators in the far-from-equilibrium regime: JE 
estimator should be the best among the estimators; while 
FD and MW estimators be almost equally poor in this 
regime. 
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FIG. 7 A and B. Biases and variances dependence on the 
number of trajectories for P5ab and P5abc molecules at two 
loading rates 20 (closed) and 40 pN/s (crossed) at the exten- 
sion z = 430 nm. C and D. Square roots of the MSEs for the 
three estimators for the two molecules at the same extensions: 
MW estimator square, times and solid line; FD estimator cir- 
cle, plus and dotted line; and JE estimator diamond, star and 
dashed line. 
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see the lines in Fig. [7p. We firstly see that MW estima- 
tor is the worst among the three estimators. Although 
it is known that the FD estimator in near-equilibrium 
regime is unbiased, for smaller N, here about < 10, the 
JE estimator is still superior to FD. This result is from 
the error in estimating <rj from limited data. When N is 
larger, FD estimator is the best among them. Our obser- 
vations are coincident with the previous analysis (|4QT ) . In 
addition, we also see that for the case of 40 pN/s, larger 
number of trajectories will be required to get the same 
quality of any estimator. 

So how about the estimators far from equilibrium? The 
same properties are showed in Fig. [7)3 and D. We see 



2. Constant force ensemble 

Compared to the general thermodynamics of RNA un- 
der force in equilibrium, single-molecule methods are 
more interesting in kinetic folding and unfolding stud- 
ies. With the single molecule experiments we can fol- 
low the actual folding or unfolding trajectories of a sin- 
gle molecules on high resolution even when they occur 
in equilibrium state, which will shed light on the dif- 
ficult kinetic folding problem <|28t l29|h We mentioned 
above that the extension of P5ab in the constant ex- 
tension ensemble may hop back and forth between two 
states. To investigate this bistability, Ref. (0) imposed 
a constant force on P5ab (0). They found that, when 
the force was held constant at the transition within ~ 1 
pN, P5ab switched back and forth with time from the 
folded hairpin (hp) to the unfolded single strand (ss). A 
two-state kinetics was proposed to explain the intrigu- 
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ing phenomenon. The rate constants for unfolding re- 
action can be fit to an Arrhenius-like expression of the 
form k u (f) oc exp(/Ax£ /fc^T), where Ace* is the ther- 
mally averaged distance between the hairpin state and 
the transition state along the direction of force. A similar 
expression also holds for the folding rate kf(f). Appar- 
ently, this description can not clarify the physics under- 
lying the folding and unfolding reactions. 

Because our simulation is based on the microscopic in- 
teractions, we are interested in whether the similar time 
traces can be obtained by simulation. Wc arc not ready 
to choose the direct move sets in Eq. U instead another 
reasonable move set was proposed l|28|) to enhance sim- 
ulation efficiency. Because what we concern about is ki- 
netic behaviors of single RNA, for convenience the contri- 
bution of double-stranded handles is neglected. Indeed, 
under constant force the handle can be viewed as one part 
of the feedback mechanism. If we model single-stranded 
part of RNA structure Si as an extensible freely joined 
chain, the elastic free energy contribution of it under 
force / then is W ss (rii, f) = JiikgTbss/PsS In [smh(u)/u], 
where u = l ss f /ksT. Therefore Eq. [21 is largely simpli- 
fied in such ensemble as 



AG° -W ss {n u f). 



(18) 



We see that the extensional variables are absent from 
systematic energy. Note the extension we record in sim- 
ulation now is x ss {f,rii). Such simplification requires 
that under mechanical force the conformational relax- 
ation process of the single stranded part of RNA is faster 
than the slowest process of the secondary structural ar- 
rangement. Our discussion shows that in contrast to the 
constant extension ensemble, the RNA secondary struc- 
ture S alone can completely specify any state of the con- 
stant force ensemble. Therefore, the move set is the same 
with the set for RNA folding without force, i.e., the un- 
folding space is C(l). 

we recalculate the force-extension curves for the three 
RNA molecules in equilibrium; see Fig. |2 inset. Except 
the regions before transitions where the elastic property 
of the handle dominates, the shapes of the curves and the 
values of unfolding force obtained by the two different 
simulation methods are almost the same. 

We record the extension-time traces of the RNA 
molecules at different constant forces in equilibrium. In 
order to compare with real data, ionic correction are took 
into account (see the third corrections in Table I). For 
example, one extension-time traces at force 14.8 pN for 
P5ab without Mg 2+ is shown in Fig. EJA The exten- 
sion of the molecule jumps between two values, ~ 5 nm 
and ~ 22 nm around the unfolding force. Because the 
jumps are extremely rapid with respect to the lifetimes 
of the molecule in the two states, we simply classify the 
states whose extensions are larger than 15 nm as the sin- 
gle stranded states, and the others as the hairpin states. 
In addition, there are significant fluctuations about the 
two states. Around the unfolding the frequencies of the 
different lifetimes at the single stranded state and the 
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FIG. 8 Simulation for RNA p5ab kinetics. A. Extension ver- 
sus time traces of the molecule at a force in equilibrium, here 
the unit of time is t . Frequency distributions of the lifetimes 
of the single stranded (B) and hairpin states (C) at 14.2 pN. 
The average lifetimes of these two states in this simulation 
can be obtained by fitting to exponential functions; see the 
curves therein. 



hairpin state can be obtained by a long time simula- 
tion (in order to get enough data, the simulation time 
is 10 9 To after equilibrium). Fig. 0J: shows the frequency 
distributions of a typical simulation at force 14.2 pN. 
These distributions can be fit to an exponential func- 
tion oc exp(— t/(Ti)) very well, where (rj), i = u,f denote 
the force-dependent average lifetimes at the two states, 
respectively. For example, the average lifetimes in this 
simulation are (r u ) w 6.2 x 10 4 r o and (r/) « 3.5 x 10 4 r o . 
We calculate all average lifetimes near the unfolding force 
of P5ab, and their corresponding values with different 
forces are shown in Fig. |5J We find that the logarithms 
of the lifetimes for the two states are strikingly consistent 
with linear functions of the forces. Because the reaction 
rate constants are the inverse of the average lifetimes, 
we fit t by making (r u )(f*) = (t/)(/*) equal to the ex- 
perimental value 1/fc*, where k* = k u = kf, and had 
Tq 1 = 2.6 x 10 5 sec -1 . Using the same method, the reac- 
tion rate constants for P5ab in the presence of Mg 2+ are 
also calculated. A comparison of the simulation results 
and the experimental data is listed in Table II. Because 
our simulation does not need additional fitting parame- 
ters, the striking consistence of our results with the ex- 
periment assures us that the RNA folding and unfolding 
model proposed here has grabbed the main physics. 



IV. DISCUSSION AND CONCLUSION 

Compared to the enormous kinetic simulations of the 
force unfolding proteins, the effort contributed to RNA 
is relatively little. To fit the gap, we developed a kinetic 
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TABLE II Simulation results for P5ab compared to the experimental data from Ref. (0) (in bold). 

Molecule (Ax){nm) f* (pN) lnfc f (/)(s~ 1 ) mfc^/Xs" 1 ) 

P5ab, Mg +2 19 ± 2 14.5 ± 1 41 ± 1.9 - (2.8 ± 0.1)/ -39 ± 2.3 + (2.9 ± 0.2)/ 

P5ab, by Cocco et al. 15.1 27.5 - 2.74/ -42.9 + 1.93/ 

P5ab, by us 20.0 14.7 39.4 - 2.6/ -30.1 + 2.2/ 

P5ab, EDTA 18 ± 2 13.3 ±1 37 ± 4.0 - (2.7 ± 0.3)/ -32 ± 4.8 + (2.6 ± 0.4)/ 

P5ab, by us 20.0 14.2 35.7 - 2.4/ -28.3 + 2.1/ 
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FIG. 9 Logarithm of the average lifetimes of single stranded 
and hairpin states for p5ab molecule at difference forces 
around the unfolding. The time is in unit r , which can be 
obtained by fitting to experimental data. Note the slopes of 
ln(r/) and ln(r u ) are independent of the fitting value r . The 
crossed symbols are for the presence of Mg 2+ . 



stochastic simulation to the force unfolding single RNAs. 
Different from previous force unfolding method, for the 
constant extension ensemble external time-dependent 
force can be taken into account correctly. We make use 
of the algorithm to investigate the experiment testing 
the important Jarzynski's equality (0). Instead of the 
force versus extension trajectories used in the experi- 
ment, the extension versus time trajectories are used to 
reconstruct the free energy landscapes, compare the qual- 
ities of the three estimators, and investigate the estima- 
tors dependence on the trajectories number. For the con- 
stant force ensemble, we particularly studied the inter- 
esting extension-time traces dependence on force, which 
would be relevant to RNA folding dynamics. 

The most advantage in study of force unfolding RNAs 
is that the knowledge accumulated in the past forty 
years for RNA secondary structure provides a solid 
fundament for theoretical predictions (including kinetics 
and thermodynamics) in practice. Therefore, we believe 
that our model would be useful in RNA biophysical 
studies in the future. Of course several improvements 



still can be added in our algorithm, e.g., adding the 
effects of Mg +2 to include complicated tertiary interac- 
tions (@). Recent works have shown that the inclusion of 
pseudoknots is possible ijHflU H^h. 

The computation of this work was performed on the 
HP-SC45 sigma-X parallel computer of ITP and ICTS, 
CAS. We thank Dr. Flamm for providing us his computer 
program KINFOLD. F.L. thanks Dr. F.Ye, R.-L. Dai, 
and Y. Zhang for their supporting in the computation. 
This research was supported by the theoretical physics 
special fund, NSFC. 
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